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Abstract. The Ising model is often referred to as the most studied model of 
statistical physics. It describes the behavior of ferromagnetic material at differ- 
ent temperatures. It is an interesting model also for mathematicians, because 
although the Boltzmann distribution is continuous in the temperature parameter, 
the behavior of the usual single-spin dynamics to sample from this measure varies 
extremely. Namely, there is a critical temperature where we get rapid mixing 
above and slow mixing below this value. Here, we give a survey of the known 
results on mixing time of Glauber dynamics for the Ising model on the square 
lattice and present a technique that makes exact sampling of the Ising model at 
all temperatures possible in polynomial time. At high temperatures this is well- 
known and although this seems to be known also in the low temperature case 
since Kramer and Waniers paper [KW41] from the 1950s, we did not found any 
reference that describes exact sampling for the Ising model at low temperatures. 



1. Introduction 

In this article we summarize the known results about the mixing time of the heat 
bath dynamics for the Ising model and combine them with some graph theoretic 
results to an algorithm to sample exactly from the Ising model in polynomial time. 
By time (or running time) we always mean the number of steps of the underlying 
Markov chain. The algorithm that will be analyzed (Algorithm 2, given in Section 
|5]) is at high temperatures simply the Coupling from the past algorithm (see Propp 
and Wilson |PW96j ). At low temperatures we have to produce a sample at the dual 
graph, but this can be traced back to sampling on the initial graph with constant 
boundary condition. 

The main theorem of this article is stated as follows. 
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Theorem [SJ. Let Gl be the square lattice with N = L 2 vertices. Then, Algorithm 
2 outputs an exactly distributed Ising configuration with respect to ir j3 L in expected 
time smaller than 

• cp N (log N) 2 for ^ (3 C = log(l + V2) and some cp > 

• 16 iV^logiV for (3 = (3 C , where C is given in ([I]). 

As a consequence we get that one can estimate the expectation of arbitrary functions 
with respect to the Boltzmann distribution in polynomial time. Namely, if we use 
the simple Monte Carlo method to approximate the expectation of a function / 
on the Ising model, we need e -2 ||/||2 exact samples from Kp (i.e. Algorithm 2) to 
reach a mean square error of at most e. Therefore, if we denote the bounds from 
Theorem [S] by Tp, we need on average Tp e~ 2 \\f — E W0 /||| steps of the Markov chain 
that will be defined in Section [2] 

The first polynomial-time algorithm (FPRAS) was shown by Jerrum and Sinclair 
[JS93]. There they present an algorithm to approximate the partition function Zp 
and, as a consequence, approximate expectations of functions that are given in terms 
of the partition function in polynomial time at all temperatures /3. 



2. The Ising model 
In this section we introduce the two-dimensional Ising model. 

Let G = (V, E) be a graph with finite vertex set V a 7? and edge set E = 
{{w, v } e (X) : \u — v | = l}, where (X) is the set of all subsets of V with 2 elements. 
From now, N := \V\. We are interested in the square lattice, i.e. V = {1, . . . , L} 2 
for some L = G N, because it is the most widely used case. We denote the 
induced graph by Gl- 

The Ising model on Gl is now defined as the set of possible configurations Q\s = 
{ — 1, l}^, where a G Qis is an assignment of -1 or 1 to each vertex in V, together 
with the probability measure 

irp(a) := tt°» = ^expj/3 ]T 1 (a(u) = a(v)) I , 

P L U,V'.U4-±V ) 

where u •f-)- v means u and v are neighbors in Gl, Z is the normalization constant and 
(5 > is the called the inverse temperature. This measure is called the Boltzmann 
(or Gibbs) distribution with free boundary condition. 

Additionally we need the notion of boundary conditions, but we restrict ourself here 
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to the "all plus" and "all minus" case. 

Let V c = Z 2 \ V. Then we denote the lattice Gl together with the probability 
measure 



» == flf'V) = i ^» • exp i /3 £ 1 (<r( w ) = ±l) 



MUD 



by the Ising model with plus/minus boundary condition, respectively. One can 
imagine that this corresponds to the Ising model on Gl with a strip of fixed spins 
around, so every vertex in Gl has the same number of neighbors. 
In 1944 Onsager [Ons44j proved that there is a phase transition at (5 = (3 C : = 
ln(l + v^2) in the case where V = Z 2 and we will see that this value is also important 
for finite lattices. Namely, the dynamics that will be defined below is rapidly mixing 
if and only if /3 < /3 C . 

We will use the so called heat bath dynamics. These dynamics define a irreducible, 
aperiodic and reversible Markov chain X 13 = (Xf )^ with stationary distribution 
up by the transition matrix 

Pi*,*"*) = I (i + ^¥LY\ *en ls ,veV, 



N V K/3(* V4 ) 

where a v * with £ e { — 1,1} is defined by a v *{v) = £ and a v *(u) = *(u), u ^ v. 
The interpretation of this algorithm is very simple. In each step choose a random 
v e V and assign a new value to v according to 7Tp conditioned on all the neighbors 
of v. 

Note that the results of this article hold in general for all Glauber dynamics as 
defined in [Gla63] that admit a monotone coupling (see Section [3]). For a general 
introduction to Markov chains see e.g. [LPW09J, or [Mar99j in the context of spin 
systems. 

In the sequel we want to estimate how fast such a Markov chain converges to its 
stationary distribution. Therefore we first introduce the total variation distance to 
measure the distance between two probability measures v and ir, which is defined 
by 



5 E M°) 



|// - -|| TV = - 2^ \ V K<T) - 
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Now we can define the mixing time of the Markov chain with transition matrix P 
and stationary distribution irp by 

t p = min In : max ||P n (<7, ■) - 7r^(-)|| TV < — }• 
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This is the expected time the Markov chain needs to get close to its stationary 
distribution. In fact, one can bound the spectral gap of the transition matrix P in 
either direction in terms of the mixing time, see e.g. [LPW091 Th. 12.3 & 12.4], so 
one can bound the error of a MCMC algorithm to integrate functions over fijg, as 
one can read in [Rud09] . Furthermore, if the Markov chain is rapidly mixing (i.e. 
the mixing time is at most polylogarithmic in the size of the state space ilis) we get 
that the problem of integration (with an unnormalized density) on the Ising model 
is tractable, see also |NW10j . Unfortunately, there is no Markov chain that is proven 
to be rapidly mixing at all temperatures. 

However, in this article we are interested in sampling exactly from the stationary 
distribution, but first we present the known mixing time results for the Glauber 
dynamics for the Ising model. For proofs or further details we refer to the particular 
articles or the survey of Martinelli |Mar99j . Of course, we can only give a small 
selection of references, because there are many papers leading to the results given 
below. 

Theorem 1. [MO 94] Let j3 < (3 C . Then there exists a constant cp > such that the 
mixing time of the Glauber dynamics for the Ising model with arbitrary boundary 
condition on Gl satisfies 

T(3 < Cp NlogN. 

Theorem 2. |OGMS96] Let (3 > j3 c . Then there exists a constant cp > such that 
the mixing time of the Glauber dynamics for the Ising model on Gl satisfies 

7/3 > e c ? N . 

The results above can be obtained by the observation that some spatial mixing 
property of the measure irp is equivalent to the mixing in time of the Glauber 
dynamics. For details for this interesting fact, see [DSVW 04J . 
The constant cp of Theorem [1] is widely believed to be of order ■gz^-- To determine 
the mixing time in the case (3 = (3 C was a challenging problem for a long time. It 
was solved by Lubetzky and Sly in their recent paper |LS10j . 

Theorem 3. [LSlOj There exists a constant C > such that the mixing time of the 
Glauber dynamics for the Ising model on Gl at the critical temperature satisfies 

t p < 4:N C . 



EXACT SAMPLING FOR THE ISING MODEL 



5 



Remark 4. We give here only a brief description of the constant C, which can be 
given explicitly. For more details see |LS10|. p. 19]. 

However, numerical experiments on the "true" exponent suggest that C ~ 3.08 (see 
e.g. [WHS95J, [NB96] and note the explanation below). 
The constant C in Theorem [3] is given by 

(1) C = 2 + l 0g3/2 ( T ^- T ). 

Here, p + is the limiting vertical crossing probability in the random cluster model 
on a fully-wired rectangle, where the width of the lattice is 3 times its height. The 
C, as given here, differs from the one given in [LS10] by eliminating a factor of 2 in 
front of the log term and by the additional 2. The reason is that we state their result 
in terms of N and not in the side-length L of the lattice (therefore without factor 2) 
and that we are interested in the discrete time single-spin algorithms. Therefore we 
get an additional factor N in their spectral gap result ( |LS10[ Th. 1]) and a factor 
N by (see e.g. |LPW09] ) 

rp < log ( . 6 ) gap^)- 1 < 4iVgap(X^)- 1 , 

because min .7r i g c (cr) > exp(— 3N). 

The results of this section show that the Glauber dynamics is rapidly mixing for 
< Pc but very slowly mixing for larger /3. In Section H] we will see how to avoid 
this problem. 

3. Exact sampling 

In this section we briefly describe the so called Coupling from the past algorithm 
(CFTP) to sample exactly from the stationary distribution of a Markov chain. 
This algorithm works under weak assumptions on the Markov chain for every finite 
state space and every distribution, but to guarantee that the algorithm is efficient we 
need some monotonicity property of the model and that the chain is rapidly mixing. 
For a detailed description of CFTP and the proof of correctness see [PW96J . 
We restrict ourself to the heat bath dynamics for the Ising model. First note that 
the heat bath dynamics, as defined above, admits a monotone coupling, that is, 
given two realizations of the heat bath chain X = (X t )ten and Y = (Y t )ten, there 
exists a coupling (X, Y) (i.e. using the same random numbers) such that 

X t < Y t =► X t+1 < Y t+1 for all t G N, 
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where < means smaller or equal at each vertex. 

Additionally we know that —1 < o < 1 for all a G fiis, where —1 = (— l) v and 
1 = Therefore if we set Xo = —1 and Yq = 1 we know that Xo < a < Yo 

for all a and so X t < Z t < Y t for the realization Z = (Z t ) te ^ with Z = a. Since 
this holds for all a, one can choose Z ~ 7r^ and we get that whenever X t and Y t 
coalesce, they also coalesce with Z t which has the right distribution. 
After we presented the idea of the algorithm, we state the algorithm in detail. Note 
that the algorithm is called Coupling from the past, because we run the chains from 
the past to the present. The algorithm CFTP(G, /3) to sample from the distribution 
7r9 works as described in Algorithm 1. 



Algorithm 1 Coupling from the past 

Input: The graph G = (V,E) and the value of /3 
Output: An Ising configuration a ~ ftp 

1: procedure CFTP(G,/3) 

2: Set t = 

3: Set X = -1 and Y = 1 

4: while X 7^ Y do 

5: t = t+l 

6: Generate random numbers C/_2*+i, • • • , U_ 2 t~i that are 

sufficient to run the Markov chain, 
(e.g. Ui ~ Uniform {V x [0, 1]}) 

7: Set X_ 2 t+i = and K^'+i = 1 an d run the chains until 

time by using only the random numbers £/_2*+i, • • • , 
8: end while 

9: return a = X 

10: end procedure 



We denote the algorithm by CFTP ± (G, (5) if we sample with respect to 7r^, i.e. with 
plus/minus boundary condition. 



See |Hag02| for examples that show that it is necessary to go from the past in the 



future and that we have to reuse the random numbers. 

Now we state the connection between the expected running time of the CFTP al- 
gorithm and the mixing time of the Markov chain. 
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Proposition 5. [PW96J Let Tg be the expected running time of CFTP(G, f3) from 
Algorithm 1 with G = (V, E) and \V\ = N. Then 

Tp < At? log TV, 

where Tp is the mixing time of the underlying Markov chain. 

We see that exact sampling from the Boltzmann distribution is efficient whenever 
the Markov chain is rapidly mixing. By the results of Section [2] we know that this 
is the case for < f3 c . In the case (3 > (3 C we need a different technique to generate 
exact samples. Therefore we need essentially the so called random cluster model, as 
we will see in the next section. 

4. The random cluster model 

The random cluster model (also known as the FK-model) was introduced by Fortuin 
and Kasteleyn in |FK72] to study lattice spin systems with a graph structure. It is 
defined on a graph G = (V, E) by its state space Q-rc = W '■ w C E} and the RC 
measure 

f, p (u) = I p H(l-p)l*l-M 2 oH j 

where p G (0, 1), Z is the normalization constant and C(u) is the number of con- 
nected components in the graph (V, lo). For a detailed introduction and related 
topics see the book [Gri06j. 

There is a tight connection between the Ising model and the random cluster model. 
Namely, if we set p = 1 — e _/3 , we can translate an Ising configuration a ~ ftp 
to a random cluster state u ~ [i. p and vice versa. To get an Ising configuration 
a G Q\s from to G £7rc assign independent and uniformly random spins to each 
connected component of lo. For the reverse way include all edges e = {ei,e2} G E 
with <r(ei) = o"(e 2 ) to u> with probability p. For details see [ES88J. 
Therefore sampling an Ising configuration according to Tip is equivalent to sampling 
a RC state from /i p whenever both models are defined on the same graph G and 
p = 1 — e _/3 . 

Another important concept in connection with the RC model is the duality of graphs 
(see e.g. |Cril0j ). Let G = (V, E) be a finite, planar graph, i.e. without intersecting 
edges if we draw it in the plane (like our Gl). The dual graph G* = (V*, E*) of G is 
constructed as follows. Put a vertex in each face (including the infinite outer one) 
of the graph and connect 2 vertices by a edge if and only if the corresponding faces 
of G share a boundary edge. It is clear, that the number of vertices can differ in the 
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dual graph, but we have the same number of edges. 

Additionally we define a dual configuration to* C E* in G* to a RC state u C E in 
Gby 

e G u; •<=>- e f. oj , 

where e* is the edge in i£* that "crosses" e. (By the construction, this edge is 
unique.) See Figured] for the graph Gl with L = 3 and its dual graph G*2 together 
with 2 corresponding RC states. 
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Figure 1. Left: The graph G 3 (solid) and its dual (dashed). Right: 
A RC state on G3 (solid) and its dual configuration (dashed) 

Now we can state the following theorem about the relation of the distribution of a 
RC state and its dual, see [CrilO] . 



Proposition 6. |CrilO[ p. 164] Let G = (V, E) be a finite, planar graph and \i v be 
the random cluster measure on G. Furthermore let G* = (V*,E*) be the dual graph 
of G and Hp* be the random cluster measure on G* . 
Then 



where 
(2) 



w ~ Hp 



P 



P 



2-p 



Obviously, (p*)* = p. By Proposition E] one can see that sampling from /i p and sam- 
pling from fip* is equivalent. It is straightforward to get the following Proposition. 



Proposition 7. Sampling from the Boltzmann distribution is equivalent to sam- 
pling from the Boltzmann distribution tv9* , where 



(3) 



log ( coth 
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Additionally, 

P > /3 C ^ /3* < /3 C . 

Proof. The equivalence was shown by the above procedure, i.e. if we want to sample 
from 7r^, we can sample from n^* generate a RC state with respect to //**, go to 
the dual lattice with measure \i v and finally generate a state according to tt^. Since 
p(*) = 1 — * , the formula for p* comes from 

P* = -log(l-p*) i lQ g(^p) = log(coth0. 
This proves the statement. □ 



5. Efficient sampling for the Ising model 



In this section we show an efficient algorithm to sample exactly from the Boltzmann 
distribution. But, before we prove that it is efficient, we state our sampling algo- 
rithm. 

Therefore we first have to explain how the graph G* L looks like. It is easy to obtain 
(see Figured]) that G* L = (V£,E^) is also a square lattice with (L — l) 2 vertices 
and an additional auxiliary vertex v*, which is connected to every vertex on the 
boundary of it. We denote the operation of adding a vertex to a graph and connect 
it to all boundary vertices by U&. So G* L = Gl-i U& v*. 

Algorithm 2 Sampling from the Ising model on the square lattice 
Input: An integer L and the value of 

G 

Output: An Ising configuration a ~ iTp h 

l: if p < p c then 

2: a = CFTP(C7 L ,/3) 
3: else 

4: a = CFTP + (C7 L _i, P*), where P* is given in © 

5: Define a Ising configuration a* on G* L = Gl-i U& v* by 

a*(v) = a(v) on V(G L ^) and a*(v*) = 1. 
6: Generate a RC state u* from a* 
7: Take the dual RC state u = (cj*)* 
8: Generate an Ising configuration a from uj 

9: end if 

10: return o 
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Theorem 8. Let Gl be the square lattice with N = L 2 vertices. Then, the algorithm 
from above outputs an exactly distributed Ising configuration with respect to tt^ l in 
expected time smaller than 

• cpN (log Nf for f3 ^ f3 c = log(l + V2) and some cp > 

• 16 iV^logiV for (3 = (3 C , where C is given in ([T]). 



Proof. The running time of the algorithm follows directly from Theorems [T] and [3] 
and Prop. [5j Therefore we only have to prove that the output a of the algorithm 
has the right distribution. In the case of /3 < (3 C this is obvious. For (3 > f3 c we 
know from Proposition [7] that a ~ tt^ l , if the dual configuration a* on G* L (line 5 
of Algorithm 2) is distributed according to 7r^» := itgf. But by the construction of 
lines 4 and 5 of Algorithm 2, this is true. For this, note that 7173(77) = 7Tg(— 77) for all 
77 G Qis- We get that for each vertex v G V (especially for v*) 

7173(7?) = 7173(77 H {a: a{v) = 1}) + tt^ (77 n {a : a(v) = -1}) 

= ir p ({a:a{v) = l}) 7173(77 1 {a: a(v) = 1}) 

+ ftp {{<?'■ cr(w) = -1}) 77/3(77 | {°" : °{ v ) = 

= g 7r /3(^K (T: °"( u ) = 1 i) + ^(77 1 ^(^) = — !}) 
= \np({ri,-ri}\{o-: a(v) = 1}). 
The last equality comes from the fact that 

np{v\{<r''<r(v) = -i}) = ^p{~V \ W- <r(v) = 1}). 

Therefore we can sample from 7173 on G* L by sampling 77 from the conditional measure 
7173 (■ I {a: <j(v*) = 1}) and then choose with probability | either 77 or —77. If we 
now use that = Gl-i U& t>* one can see that sampling on G* L with respect to 
7173 (■ I {a: cr(v*) = 1}) is the same as sampling a from itg ~ u and setting 



a{v) 




v G y(G L _! 



Note that we omit the step of choosing a or —a with probability |, because the RC 
state that will be generated would be the same. 
This completes the proof. 

□ 
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Remark 9. Note that the same technique works also for the g-state Potts model. 
This model consists of the state space f2p = {1, . . . , q} v and the same measure irp. 
In this case we consider the random cluster measure 

and the connection of the models is again given by p = 1 — e - ^. 
A recent result of Beffara and Duminil-Copin [BP 10] shows that the self-dual point 
of the RC model corresponds to the critical temperature of the Potts model (3 c (q) = 
ln(l + y/q) in the same way as in the case q = 2 (i.e. the Ising case). Therefore, a 
sampling algorithm for the Potts model above (and at) the critical temperature is 
enough to sample at all temperatures. 
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